Chi-Square Tests for Comparing Weighted Histograms 



Abstract 

Weighted histograms in Monte Carlo simulations are often used for the es- 
timation of probability density functions. They are obtained as a result of 
random experiments with random events that have weights. In this paper, 
the bin contents of a weighted histogram are considered as a sum of random 
variables with a random number of terms. Generalizations of the classical 
chi-square test for comparing weighted histograms are proposed. Numerical
examples illustrate an application of the tests for histograms with
different statistics of events and different weight functions. The proposed
tests can be used for the comparison of experimental data histograms with
simulated data histograms, as well as for the comparison of two simulated
data histograms.

Key words: homogeneity test, random sum of random variables, fit of Monte
Carlo distribution to data, comparison of experimental and simulated data.



1. Introduction 

A histogram with $m$ bins for a given probability density function $p(x)$ is
used to estimate the probabilities $p_i$ that a random event belongs to bin $i$:

$$p_i = \int_{S_i} p(x)\,dx, \quad i = 1, \ldots, m. \tag{1}$$

Integration in (1) is carried out over the bin $S_i$ and $\sum_{i=1}^{m} p_i = 1$.
A histogram can be obtained as a result of a random experiment with the
probability density function $p(x)$.

A frequently used technique in data analysis is the comparison of two dis- 
tributions through the comparison of histograms. The hypothesis of homo- 
geneity [H is that the two histograms represent random values with identical 
distributions. It is equivalent to the existence of $m$ constants $p_1, \ldots, p_m$,
such that $\sum_{i=1}^{m} p_i = 1$, and the probability of belonging to the $i$th bin
for some measured value in both experiments is equal to $p_i$.

Let us denote the number of random events belonging to the $i$th bin of the
first and second histograms as $n_{1i}$ and $n_{2i}$, respectively. The total number
of events in each histogram is equal to $n_j = \sum_{i=1}^{m} n_{ji}$, where $j = 1, 2$.
Note that overflows and underflows have to be taken into account for these
relations to hold.

It has been shown in Ref. 0] that the statistic 

$$\sum_{i=1}^{m} \frac{(n_{ji} - n_j p_i)^2}{n_j p_i} \tag{2}$$



has approximately a $\chi^2_{m-1}$ distribution. For two statistically independent
histograms with probabilities $p_1, \ldots, p_m$ the statistic

$$\sum_{j=1}^{2} \sum_{i=1}^{m} \frac{(n_{ji} - n_j p_i)^2}{n_j p_i} \tag{3}$$



has approximately a $\chi^2_{2m-2}$ distribution. If the probabilities $p_1, \ldots, p_m$
are not known, then they can be estimated by the minimization of Eq. (3). The
estimation of $p_i$ is carried out by the following expression:

$$\hat p_i = \frac{n_{1i} + n_{2i}}{n_1 + n_2}, \tag{4}$$

as shown in Ref. [0]. By substituting expression (4) in Eq. (3), the statistic

$$X^2 = \sum_{j=1}^{2} \sum_{i=1}^{m} \frac{(n_{ji} - n_j \hat p_i)^2}{n_j \hat p_i}
      = \frac{1}{n_1 n_2} \sum_{i=1}^{m} \frac{(n_2 n_{1i} - n_1 n_{2i})^2}{n_{1i} + n_{2i}} \tag{5}$$

is obtained. This statistic has approximately a $\chi^2_{m-1}$ distribution because
$m - 1$ parameters are estimated. The statistic (5) was first developed in
[i| and is widely used to test the hypothesis of homogeneity. 
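
The two forms of statistic (5) can be checked against each other numerically. The following is a minimal sketch in Python; the function name `chi2_homogeneity` and the choice to skip bins that are empty in both histograms are assumptions made for illustration, not part of the original text.

```python
def chi2_homogeneity(n1_counts, n2_counts):
    """Classical chi-square homogeneity statistic for two unweighted histograms.

    Returns the statistic computed in both forms of Eq. (5).
    """
    n1, n2 = sum(n1_counts), sum(n2_counts)
    double_sum = 0.0
    closed_form = 0.0
    for a, b in zip(n1_counts, n2_counts):
        if a + b == 0:
            continue  # assumption: bins empty in both histograms are skipped
        p_hat = (a + b) / (n1 + n2)                      # estimator (4)
        double_sum += (a - n1 * p_hat) ** 2 / (n1 * p_hat)
        double_sum += (b - n2 * p_hat) ** 2 / (n2 * p_hat)
        closed_form += (n2 * a - n1 * b) ** 2 / (a + b)
    return double_sum, closed_form / (n1 * n2)


first, second = chi2_homogeneity([10, 20], [20, 10])
print(first, second)
```

For the two-bin example both forms give 20/3, to be compared with a chi-square distribution with $m - 1 = 1$ degree of freedom.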

Weighted histograms are often obtained as a result of Monte-Carlo simu- 
lations. References 

Baa 

are examples of research on high-energy physics, 
statistical mechanics, and astrophysics using such histograms. Operations
with weighted histograms have been realized in contemporary systems for 
data analysis HBOOK [3], Physics Analysis Workstation (PAW) |s} and the 
ROOT framework (if, developed at CERN (European Organization for Nu- 
clear Research, Geneva, Switzerland). 

To define a weighted histogram let us write the probability $p_i$ (1) for a
given probability density function $p(x)$ in the form

$$p_i = \int_{S_i} p(x)\,dx = \int_{S_i} w(x)\,g(x)\,dx, \tag{6}$$

where

$$w(x) = p(x)/g(x) \tag{7}$$

is the weight function and $g(x)$ is some other probability density function.
The function $g(x)$ must be $> 0$ for points $x$ where $p(x) \neq 0$. The weight
$w(x) = 0$ if $p(x) = 0$, see Ref. Because of the condition $\sum_i p_i = 1$,
we will further call the above-defined weights normalized weights, as opposed
to the unnormalized weights $\check w(x)$, which are $\check w(x) = \mathrm{const} \cdot w(x)$.

The histogram with normalized weights is obtained from a random
experiment with a probability density function $g(x)$, and the weights of the
events are calculated according to (7). Let us denote the total sum of the
weights of the events in the $i$th bin of the histogram with normalized weights
as

$$W_i = \sum_{k=1}^{n_i} w_i(k), \tag{8}$$

where $n_i$ is the number of events in bin $i$ and $w_i(k)$ is the weight of the $k$th
event in the $i$th bin. The total number of events in the histogram is equal
to $n = \sum_{i=1}^{m} n_i$, where $m$ is the number of bins. The quantity
$\hat p_i = W_i/n$ is the estimator of $p_i$ with the expectation value $E\,\hat p_i = p_i$.
Note that in the case where $g(x) = p(x)$, the weights of the events are equal
to 1 and the histogram with normalized weights is the usual histogram with
unweighted entries.
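
As an illustration of definitions (6)-(8), the sketch below fills a histogram with normalized weights. The densities are assumptions chosen only to keep the example short: $p(x) = 2x$ on [0, 1] and a uniform $g(x) = 1$, so that $w(x) = 2x$; the function name `weighted_histogram` is likewise hypothetical.

```python
import random


def weighted_histogram(n_events, n_bins, seed=1):
    """Fill a histogram with normalized weights w(x) = p(x)/g(x).

    Illustrative assumption: p(x) = 2x on [0, 1] and g(x) = 1 (uniform),
    so w(x) = 2x. Returns (W, counts, p_hat).
    """
    rng = random.Random(seed)
    W = [0.0] * n_bins       # sums of weights per bin, Eq. (8)
    counts = [0] * n_bins    # numbers of events per bin
    for _ in range(n_events):
        x = rng.random()     # event generated according to g(x)
        i = min(int(x * n_bins), n_bins - 1)
        W[i] += 2.0 * x      # weight from Eq. (7)
        counts[i] += 1
    p_hat = [w_sum / n_events for w_sum in W]   # estimator of p_i
    return W, counts, p_hat


W, counts, p_hat = weighted_histogram(50000, 4)
exact = [((i + 1) / 4) ** 2 - (i / 4) ** 2 for i in range(4)]
print(p_hat)
print(exact)
```

The estimates $\hat p_i = W_i/n$ fluctuate around the exact bin probabilities, and they need not sum exactly to 1 because the weights are random.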

Nowadays, the apparatus used for measurements has become more complex
and computers have become more powerful. The final theoretical prediction
of a model is often obtained by Monte Carlo simulation, often with the use
of weights for the simulated events.

Comparison of two weighted histograms, comparison of a weighted histogram
with a histogram with unweighted entries, and fitting the weights of
simulated random events to an experimental histogram are all important
parts of data analysis.

The problem of fitting experimental histograms using simulated model
histograms (weighted histograms) has been discussed in Ref. [11]. Information
about the statistical uncertainties of the weighted histograms is not used in
the fitting algorithm, and the method proposed in Ref. [11] can be recommended
for use with very high statistics of Monte Carlo simulations.

Another method that takes into account the statistical errors of the simulated
theoretical prediction was proposed in Ref. [12] for the special case of a
linear superposition of several model distributions produced by a
parameter-free Monte Carlo simulation.

On the other hand, a common approach for the comparison of a weighted
histogram and histograms with unweighted entries was developed in Ref. [13].
Unfortunately, formula (32) for the chi-square test generalization on page 633
of Ref. [13] cannot be used for cases where the histograms have different total
numbers of events. To prove this statement, it is sufficient to consider the
formula for the case of two histograms with unweighted entries. The formula
coincides with statistic (3) for the case of two histograms with equal total
numbers of events, and does not lead to statistic (5) when the numbers of
events are different. In the same way, it is not difficult to prove that all
other formulas presented in Ref. [13] cannot be used for the comparison of
histograms with different total numbers of events; this is a serious
restriction with respect to the practical application of the proposed
approach.

Modified chi-square tests for the comparison of weighted histograms and
histograms with unweighted entries were proposed in Refs. [14], [15]. The
proposed tests are available in the ROOT framework through TH1::Chi2Test.
The main disadvantage of these tests is the rather high minimal number of
events for the bins of the weighted histogram, which is equal to 25. In
addition, the tests do not work properly if the total number of events for
one histogram is considerably greater than that for another.
Among the approaches widely used in practice, the heuristic chi-square test
presented in Ref. [17] is well known. To test the hypothesis of homogeneity,
the author proposed to normalize the histograms with respect to each other
and use the statistic



(9) 
where $W_{ji}$, $j = 1, 2$; $i = 1, \ldots, m$, is the sum of weights in the $i$th bin
of the $j$th histogram and

$$d_i^2 = \begin{cases}
W_{2i} \left[ s^2(W_{1i})/W_{1i} + s^2(W_{2i})/W_{2i} \right], & n_2 > n_1, \\
W_{1i} \left[ s^2(W_{1i})/W_{1i} + s^2(W_{2i})/W_{2i} \right], & n_2 < n_1,
\end{cases} \tag{10}$$

where $s^2(W_{ji})$ is the total sum of the squares of the weights in the $i$th bin
of the $j$th histogram. The number of events per bin should be $> 20$. It is
expected that if the hypothesis of homogeneity is true, then the statistic (9)
has a chi-square distribution; however, the number of degrees of freedom was
not clearly defined in Ref. [17]. Note that statistic (9) for the case of two
histograms with unweighted entries does not lead to the classical chi-square
statistic (5).

Recently, a goodness-of-fit test for weighted histograms was proposed in
Ref. [18]. This test is a generalization of Pearson's test for weighted
histograms and leads to the usual Pearson's test for the case of histograms
with unweighted entries. In this paper, we use these results to develop a
generalization of the chi-square homogeneity test that, for the case wherein
the weights of the events in both histograms are equal to 1, leads to the
usual chi-square test. In addition, important practical tests for histograms
with unnormalized weights have been developed. It is shown that the new
approach permits an essential decrease in the minimal number of events per
bin required when compared with Refs. [15], [17], and can be applied for the
case of different total numbers of events in the histograms.

The paper is organized as follows. In Section 2, a generalization of the
chi-square homogeneity test is proposed. The test for histograms with
unnormalized weights is discussed in Section 3. Application and verification
of the tests are demonstrated using numerical examples in Section 4.
Furthermore, the sizes of the tests are compared with the calculated sizes of
the heuristic chi-square test [17] for the important practical case of a
histogram with unweighted entries (experimental data histogram) and a
histogram with unnormalized weights (simulated data histogram). The comparison
demonstrates the superiority of the proposed generalization of the chi-square
test over the heuristic chi-square test [17].

2. Homogeneity test for comparison of two histograms with normalized weights

Let us consider two histograms with normalized weights; the subindex $j$
will be used to differentiate them. The total sum of weights of events $W_{ji}$ in
the $i$th bin of the $j$th histogram, $j = 1, 2$; $i = 1, \ldots, m$, can be considered
as a sum of random variables

$$W_{ji} = \sum_{k=1}^{n_{ji}} w_{ji}(k), \tag{11}$$

where the number of events $n_{ji}$ is also a random value and the weights
$w_{ji}(k)$, $k = 1, \ldots, n_{ji}$, are independent random variables with the same
probability distribution function for a given bin. Let us introduce a variable

$$r_{ji} = E\,w_{ji} / E\,w_{ji}^2, \tag{12}$$

which is the ratio of the first moment to the second moment of the
distribution of weights in bin $i$. As shown in Ref. [18], the statistic

$$\frac{1}{n_j} \sum_{i \neq k} \frac{r_{ji} W_{ji}^2}{p_i}
  + \frac{1}{n_j}\,\frac{\left(n_j - \sum_{i \neq k} r_{ji} W_{ji}\right)^2}{1 - \sum_{i \neq k} r_{ji}\, p_i}
  - n_j, \tag{13}$$

where the sums extend over all bins $i$ except one bin $k$, has approximately a
$\chi^2_{m-1}$ distribution and is a generalization of Pearson's statistic (2).

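Note that the denominator $1 - \sum_{i \neq k} r_{ji}\, p_i > 0$. To prove this
statement let us write the probability $p_i$ as

$$p_i = g_{ji}\, E\,w_{ji}, \tag{14}$$

where $g_{ji} = \int_{S_i} g_j(x)\,dx$ is the probability that an event generated
according to $g_j(x)$ falls in bin $i$; then

$$1 - \sum_{i \neq k} r_{ji}\, p_i
  = 1 - \sum_{i \neq k} g_{ji}\,\frac{(E\,w_{ji})^2}{E\,w_{ji}^2}
  \geq 1 - \sum_{i \neq k} g_{ji} > 0, \tag{15}$$

because, following Hölder's inequality,

$$(E\,w_{ji})^2 \leq E\,w_{ji}^2. \tag{16}$$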



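For two statistically independent histograms with probabilities $p_1, \ldots, p_m$
the statistic

$$X_k^2 = \sum_{j=1}^{2} \left[ \frac{1}{n_j} \sum_{i \neq k} \frac{r_{ji} W_{ji}^2}{p_i}
  + \frac{1}{n_j}\,\frac{\left(n_j - \sum_{i \neq k} r_{ji} W_{ji}\right)^2}{1 - \sum_{i \neq k} r_{ji}\, p_i}
  - n_j \right] \tag{17}$$

has approximately a $\chi^2_{2m-2}$ distribution. The probabilities $p_i$ are not known,
and $p_1, \ldots, p_{k-1}, p_{k+1}, \ldots, p_m$ can be determined by the minimization of
Eq. (17) under the constraints

$$p_i > 0, \quad 1 - \sum_{i \neq k} p_i > 0, \quad 1 - \sum_{i \neq k} r_{1i}\, p_i > 0,
  \quad \text{and} \quad 1 - \sum_{i \neq k} r_{2i}\, p_i > 0. \tag{18}$$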

Subsequently, the homogeneity test statistic will have a $\chi^2_{m-1}$ distribution,
because $m - 1$ parameters are estimated. Let us now replace $r_{ji}$ with the
estimate

$$\hat r_{ji} = \sum_{k=1}^{n_{ji}} w_{ji}(k) \Big/ \sum_{k=1}^{n_{ji}} w_{ji}^2(k). \tag{19}$$

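Then, the test statistic is given as

$$\hat X_k^2 = \sum_{j=1}^{2} \frac{1}{n_j} \sum_{i \neq k} \frac{\hat r_{ji} W_{ji}^2}{\hat p_i}
  + \sum_{j=1}^{2} \frac{1}{n_j}\,\frac{\left(n_j - \sum_{i \neq k} \hat r_{ji} W_{ji}\right)^2}{1 - \sum_{i \neq k} \hat r_{ji}\, \hat p_i}
  - \sum_{j=1}^{2} n_j. \tag{20}$$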

Note that for histograms with unweighted entries, $W_{ji} = n_{ji}$,
$\hat p_i = (n_{1i} + n_{2i})/(n_1 + n_2)$, $\hat r_{ji} = 1$, and statistic (20) coincides with
the chi-square statistic (5). Although the estimators of the probabilities (1)
were found for histograms with unweighted entries, the general problem of
minimizing Eq. (20) to determine the estimators of the probabilities $p_i$ cannot
be effectively solved analytically. In this paper the problem is solved
numerically by coordinate-wise optimization. At each step, the minimum is
found for one probability with the others fixed, using the Brent algorithm
[19], [20]. The rather good initial approximation

$$\hat p_i = \frac{\hat r_{1i} W_{1i} + \hat r_{2i} W_{2i}}{\hat r_{1i}\, n_1 + \hat r_{2i}\, n_2} \tag{21}$$

provides a fast convergence of the algorithm to the minimum with easy control
of constraints (18). Formula (20) for histograms with unweighted entries does
not depend on the choice of the excluded bin; however, for histograms with
normalized weights, there can be a dependence. A test statistic that is
invariant to the choice of the excluded bin can be obtained as the median
value of formula (20) over the different choices of the excluded bin,

$$\hat X^2_{\mathrm{Med}} = \mathrm{Med}\,\{\hat X_1^2, \hat X_2^2, \ldots, \hat X_m^2\}, \tag{22}$$

as carried out in Ref. [18] for the goodness-of-fit test.

The chi-square approximation is asymptotic. This means that the critical
values may not be valid if the expected frequencies are too small. The use of
the chi-square test is inappropriate if any expected frequency is $< 1$, or if
the expected frequency is $< 5$ in $> 20\%$ of the bins [21], [22] for either
histogram. This restriction, known for the usual chi-square test, is also
quite reasonable for the proposed test.

Note that for the case $W_{ji} = 0$, the ratio $\hat r_{ji}$ is undefined. The average
value of this quantity over the nearest neighboring bins with non-zero bin
content can be used as an approximation of the undefined $\hat r_{ji}$, or the empty
bin can be merged with the nearest neighboring bin that is not empty. A last
possibility is to set $\hat r_{ji} = 0$, which makes the test more conservative.
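
The estimate (19) together with the first of the options for empty bins can be sketched as below. The function name `r_hat_per_bin` and the input layout (one list of event weights per bin) are assumptions for illustration; the sketch also assumes that at least one bin is non-empty.

```python
def r_hat_per_bin(weights_per_bin):
    """Estimate (19) for each bin: sum of weights over sum of squared weights.

    Bins where the ratio is undefined are filled with the average of the
    nearest bins on each side for which the ratio is defined.
    """
    r = []
    for ws in weights_per_bin:
        s2 = sum(w * w for w in ws)
        r.append(sum(ws) / s2 if s2 > 0 else None)
    filled = list(r)
    for i, value in enumerate(r):
        if value is not None:
            continue
        left = next((r[t] for t in range(i - 1, -1, -1) if r[t] is not None), None)
        right = next((r[t] for t in range(i + 1, len(r)) if r[t] is not None), None)
        near = [v for v in (left, right) if v is not None]
        filled[i] = sum(near) / len(near)   # assumes at least one defined ratio
    return filled


print(r_hat_per_bin([[2.0, 2.0], [], [1.0, 1.0, 1.0]]))
```

In the example the middle bin is empty, so its ratio is taken as the average of the neighboring values 0.5 and 1.0.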

3. The test for histograms with unnormalized weights 

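In practice one is often confronted with the case when a histogram is
defined up to an unknown normalization constant. Let us denote the bin
content of a histogram with unnormalized weights as $\check W_{ji}$; then
$W_{ji} = \check W_{ji} C_j$, and the test statistic (13) can be written as

$$\frac{C_j}{n_j} \sum_{i \neq k} \frac{\check r_{ji} \check W_{ji}^2}{p_i}
  + \frac{1}{n_j}\,\frac{\left(n_j - \sum_{i \neq k} \check r_{ji} \check W_{ji}\right)^2}{1 - C_j^{-1} \sum_{i \neq k} \check r_{ji}\, p_i}
  - n_j, \tag{23}$$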

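with $\check r_{ji} = C_j r_{ji}$. An estimator $\hat C_j$ for the constant $C_j$ is found in [18] by
minimization of Eq. (23) and is equal to

$$\hat C_j = \sum_{i \neq k} \check r_{ji}\, p_i
  + \sqrt{\frac{\sum_{i \neq k} \check r_{ji}\, p_i}{\sum_{i \neq k} \check r_{ji} \check W_{ji}^2 / p_i}}\;
  \Big( n_j - \sum_{i \neq k} \check r_{ji} \check W_{ji} \Big). \tag{24}$$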



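Substituting (24) into (23) and replacing $\check r_{ji}$ with the estimate $\hat{\check r}_{ji}$, we
get the test statistic

$$\frac{s_{kj}^2}{n_j} + 2 s_{kj}, \tag{25}$$

where

$$s_{kj} = \sqrt{\sum_{i \neq k} \hat{\check r}_{ji}\, p_i \sum_{i \neq k} \hat{\check r}_{ji} \check W_{ji}^2 / p_i}
  - \sum_{i \neq k} \hat{\check r}_{ji} \check W_{ji}. \tag{26}$$

The estimate $\hat{\check r}_{ji}$ in (26) is calculated in the same way as the estimate
$\hat r_{ji}$, see formula (19).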
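
A short sketch of statistic (25) is given below. The function names are hypothetical; `W` and `r` hold the bin sums and estimated ratios of one histogram with unnormalized weights, `p` the bin probabilities and `k` the excluded bin. The example also illustrates why the statistic suits unnormalized weights: multiplying all weights by a constant multiplies the bin sums by that constant and divides the ratios by it, which leaves $s_{kj}$ unchanged.

```python
import math


def s_value(W, r, p, k):
    """Quantity s_kj of Eq. (26) for one histogram with unnormalized weights."""
    idx = [i for i in range(len(p)) if i != k]
    d = sum(r[i] * p[i] for i in idx)
    a = sum(r[i] * W[i] ** 2 / p[i] for i in idx)
    return math.sqrt(d * a) - sum(r[i] * W[i] for i in idx)


def gof_unnormalized(W, r, n, p, k):
    """Goodness-of-fit statistic (25): s^2 / n + 2 s."""
    s = s_value(W, r, p, k)
    return s * s / n + 2.0 * s


p = [0.2, 0.3, 0.5]
W = [25.0, 28.0, 47.0]
r = [1.0, 1.0, 1.0]
base = gof_unnormalized(W, r, 100, p, 2)
# Same histogram with every weight multiplied by 7 (unknown normalization).
scaled = gof_unnormalized([7.0 * w for w in W], [x / 7.0 for x in r], 100, p, 2)
print(base, scaled)
```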

Statistic (25) has approximately a $\chi^2_{m-2}$ distribution. Note that this test,
as is shown in the numerical examples in Ref. [18], is liberal; in other words,
the real size of the test is slightly larger than the nominal value of the test.
For two statistically independent histograms with probabilities $p_1, \ldots, p_m$,
the statistic

$$\check X_k^2 = \sum_{j=1}^{2} \frac{s_{kj}^2}{n_j} + 2 \sum_{j=1}^{2} s_{kj} \tag{27}$$

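has approximately a $\chi^2_{2m-3}$ distribution. One degree of freedom is lost
because, as mentioned earlier, the goodness-of-fit tests (23) are liberal and
this effect accumulates for the two histograms. The probabilities $p_i$ are
not known, and the estimators $\hat p_1, \ldots, \hat p_{k-1}, \hat p_{k+1}, \ldots, \hat p_m$ can be found by
the minimization of Eq. (27). Subsequently, the test statistic

$$\hat{\check X}_k^2 = \sum_{j=1}^{2} \frac{\hat s_{kj}^2}{n_j} + 2 \sum_{j=1}^{2} \hat s_{kj}, \tag{28}$$

where

$$\hat s_{kj} = \sqrt{\sum_{i \neq k} \hat{\check r}_{ji}\, \hat p_i \sum_{i \neq k} \hat{\check r}_{ji} \check W_{ji}^2 / \hat p_i}
  - \sum_{i \neq k} \hat{\check r}_{ji} \check W_{ji}, \tag{29}$$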



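has a $\chi^2_{m-2}$ distribution. The probabilities $\hat p_i$ can be calculated in the same
way as described in Section 2 with the initial approximation

$$\hat p_i = \frac{\hat{\check r}_{1i} \check W_{1i} + \hat{\check r}_{2i} \check W_{2i}}
  {\hat{\check r}_{1i} \sum_{l=1}^{m} \check W_{1l} + \hat{\check r}_{2i} \sum_{l=1}^{m} \check W_{2l}}. \tag{30}$$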

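A test statistic that is "invariant" to the choice of the excluded bin can
again be obtained as the median value of (28) over all possible choices of the
excluded bin,

$$\hat{\check X}^2_{\mathrm{Med}} = \mathrm{Med}\,\{\hat{\check X}_1^2, \hat{\check X}_2^2, \ldots, \hat{\check X}_m^2\}. \tag{31}$$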

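As a result of the calculations presented in Section 2 and the results
obtained above, the test statistic for the comparison of a histogram with
normalized weights and a histogram with unnormalized weights can be given as

$$\hat{\check X}_k^2 = \frac{1}{n_1} \sum_{i \neq k} \frac{\hat r_{1i} W_{1i}^2}{\hat p_i}
  + \frac{1}{n_1}\,\frac{\left(n_1 - \sum_{i \neq k} \hat r_{1i} W_{1i}\right)^2}{1 - \sum_{i \neq k} \hat r_{1i}\, \hat p_i}
  - n_1 + \frac{\hat s_{k2}^2}{n_2} + 2 \hat s_{k2}, \tag{32}$$

where

$$\hat s_{k2} = \sqrt{\sum_{i \neq k} \hat{\check r}_{2i}\, \hat p_i \sum_{i \neq k} \hat{\check r}_{2i} \check W_{2i}^2 / \hat p_i}
  - \sum_{i \neq k} \hat{\check r}_{2i} \check W_{2i} \tag{33}$$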



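and has approximately a $\chi^2_{m-2}$ distribution. The probabilities $\hat p_i$ can be
calculated in the same way as described in Section 2, with the initial
approximation

$$\hat p_i = \frac{\hat r_{1i} W_{1i} + \hat{\check r}_{2i} \check W_{2i}}
  {\hat r_{1i}\, n_1 + \hat{\check r}_{2i} \sum_{l=1}^{m} \check W_{2l}}. \tag{34}$$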

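Statistic (32) for the very important case of comparing a histogram with
unweighted entries and a histogram with unnormalized weights can be given as

$$\hat{\check X}_k^2 = \frac{1}{n_1} \sum_{i=1}^{m} \frac{n_{1i}^2}{\hat p_i} - n_1
  + \frac{\hat s_{k2}^2}{n_2} + 2 \hat s_{k2}, \tag{35}$$

where $\hat p_k = 1 - \sum_{i \neq k} \hat p_i$. Statistic (35) can be used to compare the
experimental and Monte Carlo histograms, as well as for the purpose of fitting
the Monte Carlo distribution to the data.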
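
For a given set of probabilities, statistic (35) can be evaluated as in the following sketch. The function name and argument layout are assumptions: `n1_counts` are the bin counts of the experimental histogram, `W2` and `r2` the bin sums and estimated ratios of the Monte Carlo histogram with unnormalized weights, `n2` its number of events. The minimization over the probabilities is not included here.

```python
import math


def data_vs_mc_stat(n1_counts, W2, r2, n2, p, k):
    """Statistic (35): unweighted histogram 1 against histogram 2 with
    unnormalized weights. p[k] is recomputed as 1 - sum of the other p_i."""
    idx = [i for i in range(len(p)) if i != k]
    n1 = sum(n1_counts)
    p_k = 1.0 - sum(p[i] for i in idx)
    pearson = (sum(n1_counts[i] ** 2 / p[i] for i in idx)
               + n1_counts[k] ** 2 / p_k) / n1 - n1
    d = sum(r2[i] * p[i] for i in idx)
    a = sum(r2[i] * W2[i] ** 2 / p[i] for i in idx)
    s = math.sqrt(d * a) - sum(r2[i] * W2[i] for i in idx)   # Eq. (33)
    return pearson + s * s / n2 + 2.0 * s


p = [0.2, 0.3, 0.5]
# Monte Carlo histogram: 200 events, every event carrying the weight 3.
W2 = [120.0, 180.0, 300.0]
r2 = [1.0 / 3.0] * 3
print(data_vs_mc_stat([20, 30, 50], W2, r2, 200, p, 2))
print(data_vs_mc_stat([30, 30, 40], W2, r2, 200, p, 2))
```

In the first call both histograms match the probabilities exactly and the statistic vanishes up to rounding; in the second call only the Pearson part for the unweighted histogram contributes.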



4. Evaluation of the tests' sizes and power 

The hypothesis of homogeneity $H_0$ is rejected if the test statistic
$\hat X^2_{\mathrm{Med}}$ is larger than some threshold. The threshold $k_\alpha$ for a given
nominal size of test $\alpha$ can be defined from the equation

$$\alpha = P(\chi^2_l > k_\alpha)
  = \int_{k_\alpha}^{+\infty} \frac{x^{l/2-1} e^{-x/2}}{2^{l/2}\,\Gamma(l/2)}\,dx, \tag{36}$$

where $l = m - 1$.
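
Equation (36) can be solved for the threshold numerically. The sketch below is one possible way using only the Python standard library: the upper tail probability is evaluated with the known finite series for integer degrees of freedom, and the threshold is found by bisection. The function names `chi2_sf` and `chi2_threshold` are hypothetical.

```python
import math


def chi2_sf(x, dof):
    """P(chi2_dof > x) from the finite series for integer degrees of freedom."""
    if x <= 0.0:
        return 1.0
    if dof % 2 == 0:
        term = total = 1.0
        for r in range(1, dof // 2):
            term *= x / (2 * r)
            total += term
        return math.exp(-x / 2.0) * total
    chi = math.sqrt(x)
    term = chi
    total = 0.0
    for r in range(1, (dof - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    density = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)
    return math.erfc(chi / math.sqrt(2.0)) + 2.0 * density * total


def chi2_threshold(alpha, dof):
    """Threshold k_alpha of Eq. (36), found by bisection on the tail probability."""
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, dof) > alpha:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_sf(mid, dof) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for m in (5, 20):
    print(m, chi2_threshold(0.05, m - 1), chi2_threshold(0.05, m - 2))
```

The loop prints the thresholds for $l = m - 1$ and $l = m - 2$ for the two numbers of bins used in the numerical examples.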

Let us define the test size $\alpha_s$ for a given nominal test size $\alpha$ as the
probability

$$\alpha_s = P(\hat X^2_{\mathrm{Med}} > k_\alpha \mid H_0). \tag{37}$$

This is the probability that the hypothesis $H_0$ will be rejected if the
distribution of the weights $W_{ji}$, $j = 1, 2$; $i = 1, \ldots, m$, for the bins of the
histograms satisfies the hypothesis $H_0$. The deviation of the test size from
the nominal test size is an important test characteristic.

A second important characteristic of the test is the power $\beta$,

$$\beta = P(\hat X^2_{\mathrm{Med}} > k_\alpha \mid H_a). \tag{38}$$

This is the probability that the hypothesis of homogeneity $H_0$ will be rejected
if the distributions of the weights $W_{ji}$, $j = 1, 2$; $i = 1, \ldots, m$, of the compared
histograms do not satisfy the hypothesis $H_0$. The same definitions with
$l = m - 2$ in formula (36) can be used for the test statistic $\hat{\check X}^2_{\mathrm{Med}}$.

The following is a numerical example to evaluate the sizes and power of
the tests. Let us take a distribution

$$p(x) \propto \frac{2}{(x - 10)^2 + 1} + \frac{1}{(x - 14)^2 + 1} \tag{39}$$

defined on the interval [4, 16] and representing two so-called Breit-Wigner
peaks [iij]. Three cases of the probability density function $g(x)$ are considered
(see Fig. 1):

$$g_1(x) = p(x), \tag{40}$$

$$g_2(x) = 1/12, \tag{41}$$

$$g_3(x) \propto \frac{2}{(x - 9)^2 + 1} + \frac{2}{(x - 15)^2 + 1}. \tag{42}$$

Distribution $g_1(x)$ (40) results in a histogram with unweighted entries,
while distribution $g_2(x)$ (41) is a uniform distribution on the interval [4, 16].
Distribution $g_3(x)$ (42) has the same form of parametrization as $p(x)$ (39),
but with different values of the parameters.

The sizes of the tests for histograms with the number of bins equal to
5 and 20, with different weight functions, were calculated for a nominal
size $\alpha$ equal to 0.05. Calculations of the test sizes $\alpha_s$ were carried out
using the Monte Carlo method based on 10000 runs. The results of the
calculations for two histograms with normalized weights, for two histograms
with unnormalized weights, and for a histogram with normalized weights and a
histogram with unnormalized weights are presented in Figs. 2-4. Each plot
contains 9 subplots, arranged in 3 superrows and 3 supercolumns. A subplot
represents test sizes for a pair of statistically independent histograms with
weight functions $p(x)/g_i(x)$, $p(x)/g_j(x)$, and different total numbers of
events. For example, the second superrow of subplots presents the sizes for
the pairs of histograms with weight functions $p(x)/g_1(x)$, $p(x)/g_2(x)$;
$p(x)/g_2(x)$, $p(x)/g_2(x)$; and $p(x)/g_3(x)$, $p(x)/g_2(x)$. To make comparison
easier, all the plots have the same scale (intensity of gray color).
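
The way a test size is estimated by simulation can be sketched for the simplest case, two histograms with unweighted entries ($g_1(x) = p(x)$) compared with the classical statistic (5). Everything in the sketch is an illustrative assumption rather than the setup used for the figures: the peak amplitudes 2 and 1 in $p(x)$, the numbers of events and runs, the seed, the hard-coded threshold 9.487729 (the 0.95 quantile of the chi-square distribution with 4 degrees of freedom, for $m = 5$), and all function names.

```python
import math
import random

A, B, M = 4.0, 16.0, 5                  # interval [4, 16] and number of bins
PEAKS = [(2.0, 10.0), (1.0, 14.0)]      # (amplitude, centre); amplitudes assumed


def sample_p(rng):
    """Draw x from p(x) on [A, B] as a mixture of truncated Breit-Wigner peaks."""
    masses = [amp * (math.atan(B - c) - math.atan(A - c)) for amp, c in PEAKS]
    u = rng.random() * sum(masses)
    amp, c = PEAKS[0] if u < masses[0] else PEAKS[1]
    angle = rng.uniform(math.atan(A - c), math.atan(B - c))
    return c + math.tan(angle)


def fill(rng, n_events):
    """Histogram with unweighted entries: M bins of equal width on [A, B]."""
    counts = [0] * M
    for _ in range(n_events):
        i = min(int((sample_p(rng) - A) / (B - A) * M), M - 1)
        counts[i] += 1
    return counts


def chi2_homogeneity(c1, c2):
    """Classical statistic (5); bins empty in both histograms are skipped."""
    n1, n2 = sum(c1), sum(c2)
    return sum((n2 * a - n1 * b) ** 2 / (a + b)
               for a, b in zip(c1, c2) if a + b > 0) / (n1 * n2)


def estimate_size(runs=400, n1=400, n2=800, threshold=9.487729, seed=7):
    """Fraction of runs in which homogeneity is rejected although it holds."""
    rng = random.Random(seed)
    rejected = sum(chi2_homogeneity(fill(rng, n1), fill(rng, n2)) > threshold
                   for _ in range(runs))
    return rejected / runs
```

Calling `estimate_size()` returns the fraction of rejections under the hypothesis of homogeneity, which should fluctuate around the nominal size 0.05; the study in the text uses 10000 runs per point and the weighted test statistics instead.
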

Calculations of the test sizes $\alpha_s$ were carried out using the Monte Carlo
method; therefore it is reasonable to test the hypothesis $H_0^{(1)}: \alpha_s = 0.05$
against the alternative $H_a^{(1)}: \alpha_s \neq 0.05$. For this purpose the $z$ statistic
can be used [21]:

$$z = (\hat\alpha_s - 0.05) \Big/ \sqrt{\frac{0.05 \times (1 - 0.05)}{10000}}, \tag{43}$$

where $\hat\alpha_s$ is the calculated value of $\alpha_s$. If the null hypothesis is true, then
this test statistic has a standard normal distribution with mean value 0 and
standard deviation 1. For the standard normal distribution, 2.5% of the
values lie below the critical value $-1.959964$, and 2.5% lie above $1.959964$.
Therefore, if we are conducting a two-sided hypothesis test at the 0.05 level
of significance, we will accept $H_0^{(1)}$ when $|z| < 1.959964$, or
$0.045728 < \hat\alpha_s < 0.054272$. The "•" markers on the plots show regions
satisfying hypothesis $H_0^{(1)}$.
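
The acceptance interval quoted above follows directly from Eq. (43), as the short sketch below shows. The function names are hypothetical; the critical value 1.959964 and the 10000 runs are those given in the text.

```python
import math


def z_statistic(size_estimate, alpha=0.05, runs=10000):
    """z statistic (43) for an estimated test size."""
    return (size_estimate - alpha) / math.sqrt(alpha * (1.0 - alpha) / runs)


def size_acceptance_interval(alpha=0.05, runs=10000, z_crit=1.959964):
    """Interval of estimated sizes for which |z| < z_crit in Eq. (43)."""
    sigma = math.sqrt(alpha * (1.0 - alpha) / runs)
    return alpha - z_crit * sigma, alpha + z_crit * sigma


low, high = size_acceptance_interval()
print(round(low, 6), round(high, 6))
```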



Following Ref. [22], a disturbance is regarded as unimportant when $\alpha = 0.05$
and the size of the test $\alpha_s$ lies between 0.04 and 0.06. In this study, we
define the size of the test as close to the nominal value if it satisfies this
criterion. We have only an estimate of $\alpha_s$; therefore it is reasonable to test
the hypothesis $H_0^{(2)}: 0.04 < \alpha_s < 0.06$ against the alternative
$H_a^{(2)}: \alpha_s > 0.06$ or $\alpha_s < 0.04$. According to Ref. [24], the acceptance region
for $\hat\alpha_s$ to test hypothesis $H_0^{(2)}$ at the 0.05 level of significance is the
interval with endpoints $x_1$ and $x_2$ that satisfy the system of equations

$$\int_{\frac{x_1 - 0.04}{\sqrt{0.04 \times 0.96 / 10000}}}^{\frac{x_2 - 0.04}{\sqrt{0.04 \times 0.96 / 10000}}} \phi(x)\,dx = 0.95; \qquad
  \int_{\frac{x_1 - 0.06}{\sqrt{0.06 \times 0.94 / 10000}}}^{\frac{x_2 - 0.06}{\sqrt{0.06 \times 0.94 / 10000}}} \phi(x)\,dx = 0.95, \tag{44}$$

where $\phi(x)$ is the probability density function of the standard normal
distribution. System (44) has been solved numerically to give $x_1 = 0.036777$
and $x_2 = 0.063906$; therefore we accept $H_0^{(2)}$ when $0.036777 < \hat\alpha_s < 0.063906$.
The "□" markers on the plots show regions which do not satisfy hypothesis
$H_0^{(2)}$.
Figures 2-4 demonstrate that the test sizes are close to their nominal values
for the appropriate number of events in the bins of both histograms.
Moreover, they are reasonably close to the nominal values if only one
histogram has the appropriate number of events. The tests are conservative
when both histograms have an inappropriate number of events. The "o" markers
on the plots show regions with an inappropriate number of events in at least
one histogram. Table 1 presents the sizes of the test for the comparison of
a histogram with unweighted entries and a histogram with unnormalized
weights (35). This table corresponds to the first supercolumn of the plots
presented in Figs. 4(a) and 4(b). Table 2 presents the sizes of the heuristic
test (9) for the same case of a histogram with unweighted entries and a
histogram with unnormalized weights. The number of degrees of freedom of the
chi-square distribution was not clearly defined in Ref. [17], and we chose
$m - 1$, which gave the best results. Comparison of Table 1 with Table 2 shows
the superiority of the proposed test over the heuristic chi-square test.

Figure 2: Sizes of the test for the comparison of two histograms with normalized weights: (a) number of bins m = 5, (b)
number of bins m = 20. Markers show regions: "o" have an inappropriate number of events in the histograms for application
of the test, "•" satisfy hypothesis $\alpha_s = 0.05$, "□" have a size of test $\alpha_s$ that is not close to the nominal size of the test.

Figure 3: Sizes of the chi-square test for the comparison of two histograms with unnormalized weights: (a) number of bins
m = 5, (b) number of bins m = 20. Markers show regions: "o" have an inappropriate number of events in the histograms for
application of the test, "•" satisfy hypothesis $\alpha_s = 0.05$, "□" have a size of test $\alpha_s$ that is not close to the nominal size of
the test.

Figure 4: Sizes of the chi-square test for the comparison of a histogram with normalized weights and a histogram with
unnormalized weights: (a) number of bins m = 5, (b) number of bins m = 20. Markers show regions: "o" have an inappropriate
number of events in the histograms for application of the test, "•" satisfy hypothesis $\alpha_s = 0.05$, "□" have a size of test
$\alpha_s$ that is not close to the nominal size of the test.

Table 1: Sizes of the new test for the comparison of a histogram with unweighted entries and a histogram with unnormalized weights.
Gray color of a cell marks a size of test with an inappropriate number of events in the histograms; italic type marks a size of test
that satisfies hypothesis $\alpha_s = 0.05$; bold type marks a size of test that is not close to the nominal size of the test.

Table 2: Sizes of the heuristic test for the comparison of a histogram with unweighted entries and a histogram with unnormalized weights.
Gray color of a cell marks a size of test with an inappropriate number of events in the histograms; italic type marks a size of test
that satisfies hypothesis $\alpha_s = 0.05$; bold type marks a size of test that is not close to the nominal size of the test.

The powers of the tests were investigated for slightly different values of
the amplitude of the second peak of the specified probability distribution
function (see Fig. 5):

$$p_0(x) \propto \frac{2}{(x - 10)^2 + 1} + \frac{1.15}{(x - 14)^2 + 1}. \tag{45}$$

The results of these calculations are presented in Figs. 6-8. All the plots
use the same scale to facilitate comparison. It must be noted that the powers
of the tests for histograms with unnormalized weights (see Figs. 7-8) are
lower than those of the test for histograms with normalized weights. It can
be observed that the powers of the tests for histograms with 5 bins are
greater than those with 20 bins, except for the comparison of two histograms
with normalized weights and the function $g_3(x)$. This can be explained as
follows: with 20 bins, the region where the histograms differ is described in
more detail, but by bins with low statistics of events. In addition, the
power is large for the pairs of histograms that have the function $g_3(x)$ in
one of them, because more events are generated in the region where the
histograms differ; this is in agreement with the results presented in
Ref. [3]. The comparison of the powers of the tests for different pairs of
histograms with the powers of the tests for histograms with unweighted
entries demonstrates that the values of the powers as well as the sizes of
the new tests are reasonable.

Figure 5: Probability density functions $p(x)$ (solid line) and $p_0(x)$ (dashed line).

Figure 6: Powers of the chi-square test for the comparison of two weighted histograms with normalized weights: (a) number
of bins m = 5, (b) number of bins m = 20. Markers show regions with an inappropriate number of events in the histograms for
application of the test.






Figure 7: Powers of the chi-square test for the comparison of two histograms with unnormalized weights: (a) number of 
bins m = 5, (b) number of bins m = 20. Markers show regions with inappropriate number of events in the histograms for 
application of the test. 

Figure 8: Powers of the chi-square test for the comparison of a histogram with normalized weights and a histogram with
unnormalized weights: (a) number of bins m = 5, (b) number of bins m = 20. Markers show regions with an inappropriate
number of events in the histograms for application of the test.



5. Conclusions 



In this study, a chi-square homogeneity test for the comparison of his- 
tograms with normalized weights has been proposed. The test is a general- 
ization of the classical homogeneity chi-square test. In addition, a test for 
histograms with unnormalized weights has also been developed. The pro- 
posed tests are very important tools in the application of the Monte Carlo 
method as well as in simulation studies of different phenomena. The evalu- 
ation of the sizes and powers of these tests was carried out numerically for 
histograms with different number of bins, events, and weight functions. The 
same investigation was carried out for the heuristic chi-square test that is 
currently being widely used. Comparison of the results showed the superior- 
ity of the new tests over the heuristic test. The new tests can be used to fit 
the Monte Carlo data to the experimental data, compare the experimental 
data with the Monte Carlo data, compare two Monte Carlo data sets, and 
solve the unfolding problem by reweighting the events. 